Syllable-Based Speech Recognition for Amharic

Authors

  • Solomon Teferra Abate
  • Wolfgang Menzel
Abstract

Amharic is the Semitic language with the second-largest number of speakers after Arabic (Hayward and Richard 1999). Its writing system is syllabic, with a Consonant-Vowel (CV) syllable structure. Amharic orthography has a more or less one-to-one correspondence with syllabic sounds. We have used this feature of Amharic to develop a CV syllable-based speech recognizer using hidden Markov models (HMMs), and achieved 90.43% word recognition accuracy.
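
As a rough illustration of the near one-to-one correspondence between written symbols and CV syllables, the sketch below splits an Amharic word into per-character syllable units. It is not the authors' method: it relies only on the layout of the Unicode Ethiopic block, where most consonant series occupy eight consecutive code points ordered by vowel. The function name `split_syllables` and the (base character, vowel order) representation are assumptions made for this example, and irregular series such as the labiovelars are not handled specially.

```python
# Assumption: the regular Ethiopic syllable range in Unicode, where each
# consonant series spans 8 consecutive code points (one per vowel order).
ETHIOPIC_START = 0x1200
ETHIOPIC_END = 0x135A


def split_syllables(word):
    """Return one (base_consonant_char, vowel_order) pair per Ethiopic character.

    vowel_order is the 0-based position within the 8-slot series;
    characters outside the syllable range are skipped.
    """
    units = []
    for ch in word:
        cp = ord(ch)
        if not (ETHIOPIC_START <= cp <= ETHIOPIC_END):
            continue  # skip punctuation, digits, non-Ethiopic text
        offset = cp - ETHIOPIC_START
        # First-order form of the same consonant series.
        base = chr(ETHIOPIC_START + (offset // 8) * 8)
        units.append((base, offset % 8))
    return units


# Example word written with escapes: U+1230 U+120B U+121D (three characters).
example_units = split_syllables("\u1230\u120b\u121d")
```

Each written character yields exactly one unit, which is the property a syllable-based recognizer can exploit when deriving a pronunciation dictionary directly from orthography.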

Similar resources

Grapheme Based Dictionaries for Speech Recognition

This report explores the potential of the grapheme as a modeling unit for acoustic modeling in Amharic, Tamil and Telugu. While the three languages are considered phonetic, Amharic has an exception in which syllables of the form /CVC/ may be orthographically represented as /CC/. Here, the context determines the identity of the vowel within the syllable. We employ a transcription correction model to...

Syllable-based and hybrid acoustic models for Amharic speech recognition

This paper presents the results of our experiments on the use of hybrid acoustic units in speech recognition and the use of syllable and hybrid acoustic models (AM) in morpheme-based speech recognition. Although hybrid AMs did not bring improvement in speech recognition performance when words are used as dictionary entries and units in a language model (LM), we observed a significant word error ...

Analyse des performances de modèles de langage sub-lexicale pour des langues peu-dotées à morphologie riche

Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic. This paper investigates the impact on ASR performance of sub-word units for two under-resourced African languages with rich morphology (Amharic and Swahili). Two sub-word units are considered: syllable and morpheme, the latter being obtained in a supervised or...

Automatic speech recognition for an under-resourced language - Amharic

In this paper we present the development of an Automatic Speech Recognition System (ASRS) for Amharic using limited available resources and the freely available speech toolkit (HTK). There are phonological, dialectal, orthographic and morphological features of Amharic that challenge the development of ASRSs. The problem of resource scarcity is also a hindrance to the research and development in...


Publication date: 2007